Validation

This page shows how well different datasets (for 2024) reproduce various official statistics when used with the PolicyEngine US microsimulation model.

Note that the Enhanced CPS (ECPS) dataset is explicitly calibrated to these official statistics, so it is expected to perform well. Because these statistics are numerous and diverse, we expect this calibration to also improve the dataset’s performance at predicting reform impacts.

[Interactive table. Columns: name, actual, estimate_cps, estimate_puf, estimate_ecps, abs_rel_error_cps, abs_rel_error_puf, abs_rel_error_ecps, ecps_abs_rel_error_change_over_cps, ecps_abs_rel_error_change_over_puf, ecps_abs_rel_error_change_over_prev_best]

Overall, the ECPS outperforms the Census Bureau’s CPS on 88.0% of the targets and the IRS’s PUF on 86.9% of the targets.
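One way to read these shares is as the fraction of targets on which the ECPS has a smaller absolute relative error than the other dataset. The sketch below illustrates that reading. It assumes the error is defined as |estimate - actual| / |actual| and that ties do not count as wins; both are inferred from the column names rather than stated on this page. The helper names and all numbers are hypothetical.

```python
def abs_rel_error(estimate, actual):
    """Absolute relative error; ASSUMED definition: |estimate - actual| / |actual|."""
    return abs(estimate - actual) / abs(actual)


def share_outperforming(rows, candidate, baseline):
    """Fraction of targets where `candidate` has a strictly smaller error than `baseline`."""
    wins = 0
    for row in rows:
        err_candidate = abs_rel_error(row[f"estimate_{candidate}"], row["actual"])
        err_baseline = abs_rel_error(row[f"estimate_{baseline}"], row["actual"])
        if err_candidate < err_baseline:  # ASSUMPTION: ties are not wins
            wins += 1
    return wins / len(rows)


# Illustrative targets only; these are NOT values from the validation table.
targets = [
    {"name": "target_a", "actual": 100.0, "estimate_cps": 80.0, "estimate_puf": 95.0, "estimate_ecps": 99.0},
    {"name": "target_b", "actual": 200.0, "estimate_cps": 210.0, "estimate_puf": 150.0, "estimate_ecps": 204.0},
    {"name": "target_c", "actual": 50.0, "estimate_cps": 50.0, "estimate_puf": 40.0, "estimate_ecps": 45.0},
    {"name": "target_d", "actual": 400.0, "estimate_cps": 300.0, "estimate_puf": 380.0, "estimate_ecps": 390.0},
]

share_vs_cps = share_outperforming(targets, "ecps", "cps")
share_vs_puf = share_outperforming(targets, "ecps", "puf")
print(f"ECPS beats CPS on {share_vs_cps:.1%} of targets")
print(f"ECPS beats PUF on {share_vs_puf:.1%} of targets")
```

In this toy data the CPS happens to hit target_c exactly, so the ECPS cannot beat it there, which is why the two shares differ.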

The histogram below shows the distribution of the ‘relative error change under the ECPS’, comparing each metric’s ECPS performance to the better of the CPS and the PUF.
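As a rough illustration of the quantity being plotted, the sketch below computes a change relative to the previous best dataset and groups the results into bins. It assumes the change is the ECPS absolute relative error minus the smaller of the CPS and PUF errors, so negative values mean the ECPS improves on both. That definition is inferred from the column name ecps_abs_rel_error_change_over_prev_best, and the bin width, helper name, and numbers are hypothetical.

```python
import math
from collections import Counter

# Illustrative absolute relative errors only; NOT values from the validation table.
errors = [
    {"name": "target_a", "abs_rel_error_cps": 0.20, "abs_rel_error_puf": 0.05, "abs_rel_error_ecps": 0.01},
    {"name": "target_b", "abs_rel_error_cps": 0.05, "abs_rel_error_puf": 0.25, "abs_rel_error_ecps": 0.02},
    {"name": "target_c", "abs_rel_error_cps": 0.00, "abs_rel_error_puf": 0.20, "abs_rel_error_ecps": 0.12},
    {"name": "target_d", "abs_rel_error_cps": 0.25, "abs_rel_error_puf": 0.05, "abs_rel_error_ecps": 0.025},
]


def change_over_prev_best(row):
    """ASSUMED definition: ECPS error minus the smaller of the CPS and PUF errors."""
    prev_best = min(row["abs_rel_error_cps"], row["abs_rel_error_puf"])
    return row["abs_rel_error_ecps"] - prev_best


changes = {row["name"]: change_over_prev_best(row) for row in errors}

# Group changes into bins of a hypothetical width; the key is the bin index,
# so bin k covers the interval [k * width, (k + 1) * width).
BIN_WIDTH = 0.05
counts = Counter(math.floor(change / BIN_WIDTH) for change in changes.values())

# Print a text histogram, lowest bin first.
for index in sorted(counts):
    lower = index * BIN_WIDTH
    upper = lower + BIN_WIDTH
    print(f"[{lower:+.2f}, {upper:+.2f}): {'#' * counts[index]}")
```

Bins to the left of zero hold targets where the ECPS beats both other datasets. Bins to the right hold targets where at least one of them remains more accurate.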